A Partitioned Register File Architecture and Compilation Scheme
Authors
Abstract
In Clustered Instruction-level Parallel (ILP) processors, the function units are partitioned, and resources such as the register file and cache are either partitioned or replicated and then grouped together into on-chip clusters. We present a novel partitioned register file architecture for clustered ILP processors which exploits the temporal locality of references to remote registers in a cluster and combines multiple inter-cluster communication operations into a single broadcast operation using a new sendb instruction. Our scheme makes use of a small Caching Register Buffer (CRB) attached to the traditional partitioned local register file, which is used to store copies of remote registers. We present an efficient code generation algorithm to schedule sendb operations on-the-fly. Detailed experimental results show that a windowed CRB with just 4 entries provides the same performance as that of a partitioned register file with infinite non-architected register space for keeping remote registers.
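To make the idea of caching remote registers concrete, the following is a minimal toy sketch, not the paper's design. It assumes a FIFO replacement policy and counts how many inter-cluster transfers a sequence of remote register reads would need for a given buffer size. The class name, function name, and replacement policy are all hypothetical, and the sketch does not model the sendb broadcast, the windowed CRB variant, or the scheduling algorithm.

```python
from collections import OrderedDict


class CachingRegisterBuffer:
    """Toy model of a small per-cluster buffer holding copies of remote registers.

    Assumption: FIFO replacement. The abstract does not state the actual policy.
    """

    def __init__(self, entries=4):
        self.entries = entries
        self.slots = OrderedDict()  # remote register name -> cached copy

    def contains(self, reg):
        return reg in self.slots

    def fill(self, reg, value):
        """Insert a copy of a remote register, evicting the oldest entry if full."""
        if self.entries == 0:
            return  # no buffer at all: nothing can be cached
        if len(self.slots) >= self.entries:
            self.slots.popitem(last=False)  # evict the oldest entry (FIFO)
        self.slots[reg] = value


def count_transfers(remote_reads, entries):
    """Count inter-cluster transfers for a sequence of remote register reads."""
    crb = CachingRegisterBuffer(entries)
    transfers = 0
    for reg in remote_reads:
        if not crb.contains(reg):
            transfers += 1          # miss: the value must come from the remote cluster
            crb.fill(reg, value=0)  # the value itself is irrelevant to the count
    return transfers


reads = ["r1", "r2", "r1", "r3", "r1", "r2"]
print(count_transfers(reads, entries=0))  # every read needs a transfer
print(count_transfers(reads, entries=4))  # repeated reads hit the buffer
```

With this read sequence, a buffer large enough to hold all three distinct remote registers needs one transfer per register, while having no buffer needs one transfer per read. That gap is the temporal locality the abstract refers to.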
Similar resources
A Register File Architecture and Compilation Scheme for Clustered ILP Processors
In Clustered Instruction-level Parallel (ILP) processors, the function units are partitioned and resources such as register file and cache are either partitioned or replicated and then grouped together into onchip clusters. We present a novel partitioned register file architecture for clustered ILP processors which exploits the temporal locality of references to remote registers in a cluster an...
Compiler Supports and Optimizations for PAC VLIW DSP Processors
The Parallel Architecture Core (PAC) is a new VLIW DSP architecture, featuring a two cluster design, and partitioned, distributed register files with restricted access ports. Such an irregular processor poses many challenges in the construction of its compiler. This paper presents our work in providing the compilation support for PAC, based on the Open Research Compiler (ORC). We describe the d...
LC-GRFA: global register file assignment with local consciousness for VLIW DSP processors with irregular register files
Embedded processors developed within the past few years have employed novel hardware designs to reduce the ever-growing complexity, power dissipation, and die area. While using a distributed register file architecture with irregular accessing constraints is considered to have less read/write ports than using traditional unified register file structures, conventional compilation techniques can n...
A Local-Conscious Global Register Allocator for VLIW DSP Processors with Distributed Register Files
Embedded processors developed in recent years have attempted to employ novel hardware design to reduce ever-growing complexity, power dissipation, and die area. While using a distributed register file architecture with irregular accessing constraints is considered to be an effective approach rather than traditional unified register file structures, conventional compilation techniques are not ad...
LC-GRFA: global register file assignment with local consciousness for VLIW DSP processors with non-uniform register files
Embedded processors developed within the past few years have employed novel hardware designs to reduce the ever-growing complexity, power dissipation, and die area. Although using a distributed register file architecture is considered to have less read/write ports than using traditional unified register file structures, it presents challenges in compilation techniques to generate efficient code...
Publication date: 2002